Hive窗口函数之preceding and following

您所在的位置:网站首页 数据库range between Hive窗口函数之preceding and following

Hive窗口函数之preceding and following

2024-07-17 08:35| 来源: 网络整理| 查看: 265

最近发现两个特别实用的Hive函数。preceding:向前    following:向后,这两个窗口函数可以实现滑窗求和(指定rows范围)或者指定范围内数据求和(指定range范围)

学习链接:Hive窗口函数之preceding and following,试运行了下,结果如下:

1.数据源: select * from stu_score order by score;

2.函数使用: select name, score, sum(score) over(order by score range between 2 preceding and 2 following) s1, -- 当前行的score值加减2的范围内的所有行 sum(score) over(order by score rows between 2 preceding and 2 following) s2, -- 当前行+前后2行,一共5行 sum(score) over(order by score range between unbounded preceding and unbounded following) s3, -- 全部行,不做限制 sum(score) over(order by score rows between unbounded preceding and unbounded following) s4, -- 全部行,不做限制 sum(score) over(order by score) s5, -- 第一行到当前行(和当前行相同score值的所有行都会包含进去) sum(score) over(order by score rows between unbounded preceding and current row) s6, -- 第一行到当前行(和当前行相同score值的其他行不会包含进去,这是和上面的区别) sum(score) over(order by score rows between 3 preceding and current row) s7, -- 当前行+往前3行 sum(score) over(order by score rows between 3 preceding and 1 following) s8, --当前行+往前3行+往后1行 sum(score) over(order by score rows between current row and unbounded following) s9 --当前行+往后所有行 from stu_score order by score;

3.验证数据:

s2~s9都是按照对应逻辑计算的,s1的好像不是,有点像是group by之后的求和。

select name, score, sum(score) over(order by score range between 0 preceding and 10 following) s1 - from stu_score_1 order by score;

 补充: 下述代码主要是求每个团长近七天的用户数,当然只有6月30号的那一天数据是准确的,其他的日期下数据都会偏少。

--每个团长近七天数据 select dt, leader_uid, sum(user_cnt) over(partition by leader_uid order by dt asc rows 6 preceding) as pre_7d_user_num, sum(user_cnt) over(partition by leader_uid order by dt asc rows between 6 preceding and current row) pre_7d_user_num from leader_data_di where dt = date_sub('2021-06-30', 6) order by dt desc limit 10;

上述代码里的

   sum(user_cnt) over(partition by leader_uid order by dt asc rows 6 preceding) as pre_7d_user_num,    sum(user_cnt) over(partition by leader_uid order by dt asc rows between 6 preceding and current row) pre_7d_user_num

效果是一样的。 



【本文地址】


今日新闻


推荐新闻


    CopyRight 2018-2019 办公设备维修网 版权所有 豫ICP备15022753号-3